OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A Rule-Based Approach to Form Mathematical Symbols in Printed Mathematical Expressions

Internal identifier: 000398 (Main/Exploration); previous: 000397; next: 000399

Authors: Pavan Kumar [India]; Arun Agarwal [India]; Chakravarthy Bhagvati [India]

RBID : ISTEX:CD55B10CDA8FFBAD75443D3416B48C9ACD46753F

Abstract

Automated understanding of mathematical expressions (MEs) remains a challenging task because of their complex two-dimensional (2D) structure. Recognition of MEs can be online or offline; in either case, the process involves symbol recognition and analysis of the 2D structure. The process is more complex for offline, or printed, MEs because they carry no temporal information. In the present work, we focus on the recognition of printed MEs and assume that the connected components (ccs) of a given ME image are labelled. Our approach to ME recognition comprises three stages, namely symbol formation, structural analysis, and generation of an encoding form such as LaTeX. In this paper, we present the symbol formation process, in which multi-cc symbols (such as =, ≡, etc.) are formed and the identity of context-dependent symbols (for example, a horizontal line can be MINUS, OVERBAR, FRACTION, etc.) is resolved using spatial relations. Multi-line MEs such as matrices and enumerated functions are also handled in this stage. A rule-based approach is proposed for this purpose: heuristics based on spatial relations are represented as rules (knowledge), and those rules are fired depending on the input data (labelled ccs). Because knowledge is isolated from data, as in an expert system, the approach allows for easy adaptability and extensibility of the process. The proposed approach also handles single-line and multi-line MEs in a unified manner. Our approach has been tested on around 800 MEs collected from various mathematical documents, and experimental results are reported on them.
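
The idea of keeping rules separate from the labelled connected components can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `CC` bounding-box type, the `HLINE` and `EQUALS` labels, the overlap and gap thresholds, and every function name are assumptions made for illustration. Only the MINUS, OVERBAR and FRACTION readings of a horizontal line come from the abstract.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class CC:
    """A labelled connected component (hypothetical type); y grows downwards."""
    label: str
    x0: float
    y0: float  # top edge
    x1: float
    y1: float  # bottom edge


def x_overlap(a: CC, b: CC) -> float:
    """Horizontal overlap as a fraction of the narrower box's width."""
    inter = min(a.x1, b.x1) - max(a.x0, b.x0)
    width = min(a.x1 - a.x0, b.x1 - b.x0)
    return max(inter, 0.0) / width if width > 0 else 0.0


def above(line: CC, others: List[CC]) -> List[CC]:
    return [c for c in others if c.y1 <= line.y0 and x_overlap(line, c) > 0.5]


def below(line: CC, others: List[CC]) -> List[CC]:
    return [c for c in others if c.y0 >= line.y1 and x_overlap(line, c) > 0.5]


@dataclass
class Rule:
    """Knowledge lives here, separate from the data it is fired on."""
    name: str
    condition: Callable[[CC, List[CC]], bool]
    result: str


# Assumed heuristics; the first rule whose condition holds is fired.
HLINE_RULES = [
    Rule("fraction", lambda l, o: bool(above(l, o)) and bool(below(l, o)), "FRACTION"),
    Rule("overbar", lambda l, o: bool(below(l, o)) and not above(l, o), "OVERBAR"),
    Rule("minus", lambda l, o: True, "MINUS"),
]


def merge_equals(ccs: List[CC], max_gap: float = 1.0) -> List[CC]:
    """Merge two vertically stacked HLINE components into one EQUALS symbol."""
    used, out = set(), []
    for i, a in enumerate(ccs):
        if i in used:
            continue
        partner = None
        if a.label == "HLINE":
            for j in range(i + 1, len(ccs)):
                b = ccs[j]
                if j in used or b.label != "HLINE":
                    continue
                top, bottom = (a, b) if a.y0 <= b.y0 else (b, a)
                gap = bottom.y0 - top.y1
                # Assumed test: near-identical x-extent and a small vertical gap.
                if x_overlap(a, b) > 0.8 and 0 <= gap <= max_gap * (a.x1 - a.x0):
                    partner = j
                    break
        if partner is None:
            out.append(a)
        else:
            b = ccs[partner]
            used.add(partner)
            out.append(CC("EQUALS", min(a.x0, b.x0), min(a.y0, b.y0),
                          max(a.x1, b.x1), max(a.y1, b.y1)))
    return out


def form_symbols(ccs: List[CC]) -> List[CC]:
    """Form multi-cc symbols first, then resolve context-dependent lines."""
    merged = merge_equals(ccs)
    result = []
    for i, c in enumerate(merged):
        if c.label == "HLINE":
            others = merged[:i] + merged[i + 1:]
            fired = next(r for r in HLINE_RULES if r.condition(c, others))
            c = replace(c, label=fired.result)
        result.append(c)
    return result
```

In this sketch, supporting a new context-dependent reading means adding a `Rule` to `HLINE_RULES` and leaving `form_symbols` unchanged, which is the kind of adaptability the abstract attributes to isolating knowledge from data. The spatial tests are deliberately crude (they ignore vertical distance, for example) and would need tuning on real images.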

DOI: 10.1007/978-3-642-25725-4_16


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct:series">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A Rule-Based Approach to Form Mathematical Symbols in Printed Mathematical Expressions</title>
<author>
<name sortKey="Kumar, Pavan" sort="Kumar, Pavan" uniqKey="Kumar P" first="Pavan" last="Kumar">Pavan Kumar</name>
</author>
<author>
<name sortKey="Agarwal, Arun" sort="Agarwal, Arun" uniqKey="Agarwal A" first="Arun" last="Agarwal">Arun Agarwal</name>
</author>
<author>
<name sortKey="Bhagvati, Chakravarthy" sort="Bhagvati, Chakravarthy" uniqKey="Bhagvati C" first="Chakravarthy" last="Bhagvati">Chakravarthy Bhagvati</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:CD55B10CDA8FFBAD75443D3416B48C9ACD46753F</idno>
<date when="2011" year="2011">2011</date>
<idno type="doi">10.1007/978-3-642-25725-4_16</idno>
<idno type="url">https://api.istex.fr/document/CD55B10CDA8FFBAD75443D3416B48C9ACD46753F/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001492</idno>
<idno type="wicri:Area/Istex/Curation">001406</idno>
<idno type="wicri:Area/Istex/Checkpoint">000054</idno>
<idno type="wicri:doubleKey">0302-9743:2011:Kumar P:a:rule:based</idno>
<idno type="wicri:Area/Main/Merge">000403</idno>
<idno type="wicri:Area/Main/Curation">000398</idno>
<idno type="wicri:Area/Main/Exploration">000398</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">A Rule-Based Approach to Form Mathematical Symbols in Printed Mathematical Expressions</title>
<author>
<name sortKey="Kumar, Pavan" sort="Kumar, Pavan" uniqKey="Kumar P" first="Pavan" last="Kumar">Pavan Kumar</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Inde</country>
<wicri:regionArea>Dept. of Computer and Information Sciences, University of Hyderabad, 500 046, Hyderabad</wicri:regionArea>
<wicri:noRegion>Hyderabad</wicri:noRegion>
</affiliation>
<affiliation>
<wicri:noCountry code="no comma">E-mail: pavan.ppkumar@gmail.com</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Agarwal, Arun" sort="Agarwal, Arun" uniqKey="Agarwal A" first="Arun" last="Agarwal">Arun Agarwal</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Inde</country>
<wicri:regionArea>Dept. of Computer and Information Sciences, University of Hyderabad, 500 046, Hyderabad</wicri:regionArea>
<wicri:noRegion>Hyderabad</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Inde</country>
</affiliation>
</author>
<author>
<name sortKey="Bhagvati, Chakravarthy" sort="Bhagvati, Chakravarthy" uniqKey="Bhagvati C" first="Chakravarthy" last="Bhagvati">Chakravarthy Bhagvati</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Inde</country>
<wicri:regionArea>Dept. of Computer and Information Sciences, University of Hyderabad, 500 046, Hyderabad</wicri:regionArea>
<wicri:noRegion>Hyderabad</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2011</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">CD55B10CDA8FFBAD75443D3416B48C9ACD46753F</idno>
<idno type="DOI">10.1007/978-3-642-25725-4_16</idno>
<idno type="ChapterID">16</idno>
<idno type="ChapterID">Chap16</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Automated understanding of mathematical expressions (MEs) remains a challenging task because of their complex two-dimensional (2D) structure. Recognition of MEs can be online or offline; in either case, the process involves symbol recognition and analysis of the 2D structure. The process is more complex for offline, or printed, MEs because they carry no temporal information. In the present work, we focus on the recognition of printed MEs and assume that the connected components (ccs) of a given ME image are labelled. Our approach to ME recognition comprises three stages, namely symbol formation, structural analysis, and generation of an encoding form such as LaTeX. In this paper, we present the symbol formation process, in which multi-cc symbols (such as =, ≡, etc.) are formed and the identity of context-dependent symbols (for example, a horizontal line can be MINUS, OVERBAR, FRACTION, etc.) is resolved using spatial relations. Multi-line MEs such as matrices and enumerated functions are also handled in this stage. A rule-based approach is proposed for this purpose: heuristics based on spatial relations are represented as rules (knowledge), and those rules are fired depending on the input data (labelled ccs). Because knowledge is isolated from data, as in an expert system, the approach allows for easy adaptability and extensibility of the process. The proposed approach also handles single-line and multi-line MEs in a unified manner. Our approach has been tested on around 800 MEs collected from various mathematical documents, and experimental results are reported on them.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Inde</li>
</country>
</list>
<tree>
<country name="Inde">
<noRegion>
<name sortKey="Kumar, Pavan" sort="Kumar, Pavan" uniqKey="Kumar P" first="Pavan" last="Kumar">Pavan Kumar</name>
</noRegion>
<name sortKey="Agarwal, Arun" sort="Agarwal, Arun" uniqKey="Agarwal A" first="Arun" last="Agarwal">Arun Agarwal</name>
<name sortKey="Agarwal, Arun" sort="Agarwal, Arun" uniqKey="Agarwal A" first="Arun" last="Agarwal">Arun Agarwal</name>
<name sortKey="Bhagvati, Chakravarthy" sort="Bhagvati, Chakravarthy" uniqKey="Bhagvati C" first="Chakravarthy" last="Bhagvati">Chakravarthy Bhagvati</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000398 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000398 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:CD55B10CDA8FFBAD75443D3416B48C9ACD46753F
   |texte=   A Rule-Based Approach to Form Mathematical Symbols in Printed Mathematical Expressions
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024